ABSTRACT



Context. The majority of stars form in clusters. Therefore a comprehensive view of star formation requires understanding the initial 
conditions for cluster formation. 

Aims. The goal of our study is to shed light on the physical properties of infrared dark clouds (IRDCs) and the role they play in 
the formation of stellar clusters. This article, the first of a series dedicated to the study of IRDCs, describes techniques developed to 
establish a complete catalogue of Spitzer IRDCs in the Galaxy. 

Methods. We have analysed Spitzer GLIMPSE and MIPSGAL data to identify a complete sample of IRDCs in the region of Galactic 
longitude and latitude 10° < |l| < 65° and |b| < 1°. From the 8 μm observations we have constructed opacity maps and used a newly 
developed extraction algorithm to identify structures above a column density of $N_{\rm H_2} \gtrsim 1 \times 10^{22}$ cm$^{-2}$. The 24 μm data are then used to 
characterize the star formation activity of each extracted cloud. 

Results. A total of 11303 clouds have been extracted. A comparison with the existing MSX-based catalogue of IRDCs shows that 
80% of these Spitzer dark clouds were previously unknown. The algorithm also extracts ~20000 to 50000 fragments within these 
clouds, depending on the detection threshold used. A first look at the MIPSGAL data indicates that between 20% and 68% of these IRDCs 
show 24 μm point-like association. This new database provides an important resource for future studies aiming to understand the initial 
conditions of star formation in the Galaxy. 

Key words. Catalogs; Stars: formation; ISM: clouds 



1. Introduction 

The majority of stars form in groups of a few tens to a few hundred objects (e.g. Lada & Lada 2003). Understanding 
cluster formation is therefore key to understanding the formation of stars. 
Clusters form from the gas located in the densest parts of molecular 
clouds, within structures called clumps (Blitz 1993). These 
clumps fragment into an assembly of protostellar cores which 
collapse to produce stars, forming 'protoclusters'. By definition, 
protoclusters are active star forming regions, with jets, flows 
and heating sources (e.g. Bally et al. 2006) which rapidly start 
to shape their surroundings. From the study of these protoclusters, 
it is therefore difficult to backtrack to the initial conditions 
of their formation. On the other hand, clumps which are on the 
verge of forming protostars, but which have not formed any yet, 
are structures unpolluted by star formation activity and must still 
reflect the initial conditions of the formation of protoclusters. 
Looking for and studying such 'pre-protoclusters' is crucial for 
our understanding of star formation processes. 

Only a tiny percentage of the material in any molecular 
cloud forms stars. These star-forming regions are traced by var- 
ious signposts of star formation activity such as the presence 
of strong infrared sources, outflows, jets, methanol and water 
masers and compact HII regions. The problem with identify- 
ing pre-protoclusters is that by definition these signposts are not 
yet present. Other means are thus necessary to find such objects. 
The two infrared satellites ISO and MSX have been important 
tools for this purpose. Indeed, the large infrared surveys 
these satellites carried out identified infrared dark structures, 
seen in absorption from 7 to 25 μm against the background 
emission (Perault et al. 1996; Hennebelle et al. 2001; 
Egan et al. 1998; Simon et al. 2006a). Millimeter molecular 
lines (e.g. Carey et al. 1998; Teyssier et al. 2002; Pillai et al. 
2006) and dust continuum observations (e.g. Teyssier et al. 
2002; Rathborne et al. 2006) have clearly demonstrated that 
these infrared dark clouds are dense, cold structures, possibly 
being the progenitors of protoclusters (Simon et al. 2006b). 
Rathborne et al. (2006) even suggested that the dust continuum 
"cores" observed in these IRDCs are the direct progenitors of 
massive stars. However, the wide range of mass and size of these 
IRDCs clearly suggests that they cannot all be evolving along the 
same evolutionary path and they must lead to the formation of a 
large range of different stellar contents. 

So far, the study of the earliest stages of the formation 
of protoclusters has mostly focussed on the closest objects, 
such as ρ-Oph (e.g. Motte et al. 1998; André et al. 2007), 
Perseus (Hatchell et al. 2005; Enoch et al. 2006) and NGC 2264 
(e.g. Peretto et al. 2006; Teixeira et al. 2006). The results of 
these studies set important constraints on models of star formation, 
but may not be representative of the formation of stars 
throughout the Galaxy. The only way to define such a representative 
view is through studies of large unbiased samples of the 
precursors of stellar clusters. 

In this paper we identify and characterise the IRDCs detected 
using the Spitzer GLIMPSE and MIPSGAL archive data. The 
high angular resolution of the Spitzer data provides a detailed 
probe of the structure of these sources while the high sensitivity 
of IRAC and MIPS allows us to detect previously unseen 
deeply embedded protostars/protoclusters. Section 2 of this paper 
presents the Spitzer archive data used for this study. Section 
3 will discuss the construction of 8 μm opacity maps for IRDCs, 
while Section 4 will focus on the conversion from 8 μm opacity 
to H2 column density. The extraction of structures within 
these maps will be discussed in Section 5. A comparison with 
the MSX catalogue of IRDCs is in Section 6 while Section 7 
summarizes our initial study. The nature of these dark clouds 
and their star formation activity are discussed in more detail in 
subsequent papers (Peretto & Fuller, in preparation). 

Fig. 1. These images show the GLIMPSE Spitzer 8 μm emission of 3 random IRDCs from our sample. These illustrate the diversity 
in shape and size of IRDCs. 

2. A large survey of infrared dark clouds: Spitzer 
archive data 

IRDCs are seen in silhouette against the infrared background 
emission (see Fig. 1) and as a sample are likely to contain 
protoclusters and pre-protoclusters. Even when large scale 
(sub)millimetre surveys of the Galactic plane become avail- 
able and these objects can be detected through their dust emis- 
sion, IRDCs and studies of the absorption towards these sources 
will remain important. Not only can the IRDCs be studied at 
high angular resolution at infrared wavelengths, but unlike the 
(sub)millimetre emission, their column density can be measured 
from the absorption independent of the dust temperature. 

The first large survey of IRDCs was undertaken by 
Simon et al. (2006a) using the mid-infrared data of the MSX 
satellite. In total, Simon et al. detected more than 10000 IRDCs, 
with sizes larger than (36″)$^2$ and flux density more than 2 MJy/sr 
(> 2 times the rms noise of the MSX images) below the mid-infrared 
radiation field. Within these IRDCs they extracted more 
than 12000 IRDC "cores". Simon et al. (2006b) performed a 
follow-up of a sub-sample of a few hundred sources for which they 
were able to determine distances. They found that these IRDCs 
are very similar to CO molecular clumps (e.g. Blitz 1993). 

In the GLIMPSE and MIPSGAL surveys the Spitzer satellite 
has resurveyed a large fraction of the Galactic plane at infrared 
wavelengths (10° < |l| < 65°, |b| < 1°). These data have both 
better angular resolution (2″ vs 20″ at 8 μm) and sensitivity 
(0.3 MJy/sr vs 1.2 MJy/sr at 8 μm) than the MSX data, as well 
as wider wavelength coverage. The IRAC (3.6, 4.5, 5.8, 8 μm) 
GLIMPSE and MIPS (24, 70, 160 μm) MIPSGAL observations 
provide a unique opportunity to shed light on the role of IRDCs 
during the earliest stages of star formation. Despite a smaller 
coverage of the Galactic plane by Spitzer, an initial comparison 
of the MSX IRDC catalogues with the Spitzer observations indicated 
that the Spitzer data contained IRDCs undetected by MSX 
in the same region of the Galaxy. Therefore an unbiased search 
of the Spitzer GLIMPSE data has been undertaken to identify 
IRDCs. 

Many IRDCs can be seen in silhouette up to at least 24 μm, 
providing a wide wavelength range over which they can be studied 
in absorption. However, several factors affect the choice of 
the optimal wavelength at which to identify and study the overall 
cloud properties. These include the strength and uniformity 
of the background emission, the number of foreground and 
background stars and, in principle, the wavelength dependence of 
the dust extinction law, although recent work suggests that from 
4.5 to 8 μm, the three last bands observed by Spitzer/IRAC, the 
extinction is a relatively flat function of wavelength (Lutz et al. 
1996; Indebetouw et al. 2005; Román-Zúñiga et al. 2007). The 
angular resolution of the observations is highest at the shortest 
wavelengths, but in these bands a very high density of stars is 
detected and a high degree of structure in the relatively weak 
background emission makes analysis of the images at these 
wavelengths complex. Overall, inspection of the Spitzer data shows 
that the strength and relative smoothness of the background 
emission together with the relatively low density of stars make 
the IRAC 8 μm band the most suitable for this initial study of a 
large sample of objects. 

The GLIMPSE and MIPSGAL data have been reduced and 
calibrated automatically to produce the so-called post-Basic 
Calibrated Data (PBCD). The typical flux uncertainty for 
point-like sources is ~2% at 8 μm (Reach et al. 2005) while 
the position uncertainty is less than 0.3″ (IRAC manual V8.0: 
http://ssc.spitzer.caltech.edu/documents/SOM/). However, because 
we are not looking at point-like sources but extended 
objects, a calibration factor has to be applied to the PBCD 
8 μm images (Reach et al. 2005). This calibration factor, CF, is 
a function of the aperture radius, $R_a$, for the source under 
investigation (http://ssc.spitzer.caltech.edu/irac/calib/extcal/). The 
relation between CF and $R_a$ in arcseconds, at 8 μm, is 
$CF = 1.37 \times \exp(-R_a^{0.33}) + 0.74$. Because the typical size of the structures 
we analyse is about one arcminute, in the analysis which follows 
we applied a calibration factor of 0.8 to the PBCD 8 μm images. 
A different calibration factor would not change the opacities of 
the IRDCs we calculated, but would imply different related 
intensities (Table 1). 

Fig. 2. Schematic view of a typical IRDC flux density profile. The 
meanings of the variables used in the rest of the text are illustrated in 
this figure. In this figure, $I_{\rm fore}$ has been set to a particular value, 
e.g., 38 MJy/sr, but in practice it can be anywhere between $I_{\rm zl}$ 
and $I_{\rm min}$. 

Fig. 3. Calculated opacity profiles of the IRDC plotted in Fig. 2, 
corresponding to 3 different assumptions on the foreground 
intensity. The solid line shows $I_{\rm fore} = I_{\rm zl}$ (i.e. $I_{\rm fore} = 0.25 \times I_{\rm min}$), the 
dotted line $I_{\rm fore} = 0.7 \times I_{\rm min}$ and the dashed line $I_{\rm fore} = 0.9 \times I_{\rm min}$. 
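
As a quick check of the extended-source calibration factor quoted in Section 2, the relation between CF and the aperture radius can be evaluated directly. The following is a minimal sketch in Python; the function name `extended_source_cf` is illustrative and not part of any Spitzer software, and the exponent 0.33 is read from the relation given above.

```python
import math


def extended_source_cf(radius_arcsec: float) -> float:
    """IRAC 8 um extended-source calibration factor for an aperture radius in arcsec.

    Implements CF = 1.37 * exp(-R_a**0.33) + 0.74 as quoted in the text.
    """
    return 1.37 * math.exp(-(radius_arcsec ** 0.33)) + 0.74


# A structure of about one arcminute, the typical size quoted in the text.
cf_typical = extended_source_cf(60.0)
print(f"CF(60 arcsec) = {cf_typical:.2f}")
```

For a radius of 60″ the relation gives a value close to 0.77, consistent with the rounded factor of 0.8 applied to the PBCD images.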

3. Opacity distribution of IRDCs 

3.1. Principle 

Infrared dark clouds are structures seen in absorption against the 
background emission. The strength of the absorption is directly 
related to the opacity along the line of sight. Following the notation 
of Bacmann et al. (2000), the relation between the opacity 
$\tau_\lambda$ and the intensity at wavelength $\lambda$ emerging from the cloud, 
$I_\lambda$, is given at 8 μm by 

$I_{8\mu m} = I_{\rm bg-8\mu m} \times \exp(-\tau_{8\mu m}) + I_{\rm fore-8\mu m}$    (1) 

where $I_{\rm bg-8\mu m}$ is the intensity of the background emission at 
8 μm, and $I_{\rm fore-8\mu m}$ is the foreground emission. In the following, 
for simplicity, we drop the 8 μm label on the variable names, except 
on the opacity. If we know the foreground and background 
intensities we can invert Eq. (1) and infer the spatial distribution 
of the opacity within an infrared dark cloud, 

$\tau_{8\mu m} = -\ln\left[\frac{I - I_{\rm fore}}{I_{\rm bg}}\right]$    (2) 

$I_{\rm fore}$ and $I_{\rm bg}$ are related to each other by $I_{\rm MIR} = I_{\rm bg} + I_{\rm fore}$, 
where $I_{\rm MIR}$ is the observed mid-infrared radiation field and can 
be estimated directly from the 8 μm images (see Fig. 2). A lower 
limit on $I_{\rm fore}$ is given by the intensity of the zodiacal light, $I_{\rm zl}$, 
in the direction of the cloud, while an upper limit is given by 
the minimum intensity within the cloud, $I_{\rm min}$. However, with the 
extinction data only, it is impossible to find the exact value of 
$I_{\rm fore}$ for a given cloud. 

The determination of $I_{\rm fore}$ is crucial to infer the spatial opacity 
distribution of a given IRDC. To illustrate this point, we computed 
the opacity of the cloud profile shown in Fig. 2 for three 
different values of $I_{\rm fore}$ (Fig. 3). In this figure we see that, with 
increasing $I_{\rm fore}$, the opacity increases significantly everywhere in 
the cloud, and even more sharply at the peak. These opacity variations 
are even more drastic for shallower clouds. It is therefore 
important to constrain $I_{\rm fore}$ when calculating the opacity 
distribution of an IRDC. 
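
The sensitivity of the opacity to the assumed foreground can be illustrated numerically. The sketch below evaluates $\tau_{8\mu m} = -\ln[(I - I_{\rm fore})/I_{\rm bg}]$ with $I_{\rm bg} = I_{\rm MIR} - I_{\rm fore}$ for the three foreground fractions used in Fig. 3. The intensity values (100 and 40 MJy/sr) are hypothetical numbers chosen for illustration, not measurements from the paper.

```python
import math


def opacity_8um(i_obs: float, i_mir: float, i_fore: float) -> float:
    """8 um opacity from the observed intensity, given I_MIR and an assumed I_fore."""
    i_bg = i_mir - i_fore  # I_MIR = I_bg + I_fore
    return -math.log((i_obs - i_fore) / i_bg)


# Hypothetical cloud: I_MIR = 100 MJy/sr around the cloud, I_min = 40 MJy/sr at its peak.
i_mir, i_min = 100.0, 40.0
for fraction in (0.25, 0.7, 0.9):
    i_fore = fraction * i_min
    tau_peak = opacity_8um(i_min, i_mir, i_fore)
    print(f"I_fore = {fraction:.2f} x I_min -> peak opacity = {tau_peak:.2f}")
```

With these illustrative numbers the peak opacity rises from about 1.1 to about 2.8 as the assumed foreground approaches $I_{\rm min}$, which is the behaviour described for Fig. 3.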



Of course it is also possible that at least some of the IRDCs are 
saturated and their intensity profiles become flattened. In such 
cases, it becomes impossible to recover the central structure of 
the clouds through the extinction maps. Moreover, such flatten- 
ing could lead to an incorrect interpretation of the final opacity 
profiles of IRDCs. 

3.2. Constraining $I_{\rm fore}$ 

Comparison of the infrared extinction and millimeter emission 
can be used to constrain the infrared foreground emission 
towards an IRDC by requiring that both techniques give 
the same column density towards the source. For this purpose 
we have used the 1.2 mm dust continuum images of 38 IRDCs 
obtained by Rathborne et al. (2006) with the IRAM 30m telescope 
at 11″ angular resolution. The 1.2 mm emission can be translated 
into an 8 μm opacity, $\tau_{\rm em}$, using the equation 

$\tau_{\rm em} = \frac{S_{\rm peak} \times R_\kappa}{B_{1.2}(T_d) \times \Omega_{30m}}$    (3) 

where $S_{\rm peak}$ is the 1.2 mm dust continuum emission peak of the 
source, $R_\kappa$ is the specific dust opacity ratio between 8 μm and 
1.2 mm, $B_{1.2}(T_d)$ is the Planck function at 1.2 mm for the dust 
temperature $T_d$, and $\Omega_{30m}$ is the solid angle at 1.2 mm of the 
IRAM 30m telescope beam. The value of $R_\kappa$ is not well constrained: 
different dust models provide different values of $R_\kappa$. 
Depending on the chemical composition of the emitting/absorbing dust, 
the value of $R_\kappa$ can be as large as 2000 for interstellar dust in 
diffuse clouds (e.g. Draine 2003), decreasing to 750 for dense 
clouds (e.g. Ossenkopf & Henning 1994; Johnstone et al. 2003). 
Given the dense and cold nature of IRDCs, we adopted the value 
$R_\kappa = 750$, and a dust temperature of 15 K, which gives 

$\tau_{\rm em} = 0.02 \times S_{\rm peak}$    (4) 

with $S_{\rm peak}$ in mJy/beam. After smoothing the Spitzer 8 μm images 
of the 38 IRDCs observed by Rathborne et al. (2006) to the 
same resolution as the IRAM 30m 1.2 mm images, we have 
constructed their 8 μm opacity maps assuming $I_{\rm fore} = I_{\rm zl}$ (i.e. 
the lower limit on the foreground emission). A direct comparison 
between these opacity maps and the ones calculated from the 





N. Peretto and G. A. Fuller: The initial conditions of stellar protocluster formation 



[Plot for Fig. 4. Horizontal axis: 8 μm opacity from extinction ($\tau_{\rm abs}$). Annotations: $T_d$ = 15 K, $I_{\rm fore} = I_{\rm zl}$; $\langle\tau_{\rm em}/\tau_{\rm abs}\rangle$ = 2.9 with σ = 1.1; $\langle\tau_{\rm em}/\tau_{\rm abs}\rangle$ = 7.2 with σ = 3.8.] 

Fig. 4. Plot of the 8 μm opacity estimated from the 8 μm Spitzer 
maps ($\tau_{\rm abs}$) and from the 1.2 mm dust continuum emission ($\tau_{\rm em}$). 
The starless sources are marked with open triangles while those 
associated with 24 μm point-like emission are marked with red 
open star symbols. $\tau_{\rm abs}$ has been calculated assuming $I_{\rm fore} = I_{\rm zl}$. 
The solid line marks the relationship $\tau_{\rm abs} = \tau_{\rm em}$, while the two 
dashed lines indicate $\tau_{\rm abs} = 0.5 \times \tau_{\rm em}$ and $\tau_{\rm abs} = 2 \times \tau_{\rm em}$. 



1.2 mm dust continuum images then becomes possible. However, 
the observations of the 8 μm absorption and 1.2 mm emission are 
not equally sensitive to all of the dust along the line of sight. 
Regions of low column density are more easily detected in 
absorption than in emission. For this reason, we selected only clear 
corresponding peaks in both types of image, ending up with 57 
"cores" (emission peaks and absorption minima) which have 
been used for the comparison. Amongst these cores, 11 show 
24 μm point-like emission. Figure 4 shows the resulting comparison 
for these 57 cores: the "starless" ones (those without associated 
24 μm emission) are marked with open triangles while 
the "protostellar" ones are marked with red stars. Also shown 
are the three lines $\tau_{\rm abs} = \tau_{\rm em}$ (solid line), $\tau_{\rm abs} = 2 \times \tau_{\rm em}$ and 
$\tau_{\rm abs} = 0.5 \times \tau_{\rm em}$ (dashed lines). In the figure there is a clear 
separation between the starless sources and those objects associated 
with a 24 μm point-like source. For the sources associated 
with 24 μm point-like emission, the values of $\tau_{\rm em}$ are on average 
higher than for the starless sources. The $\tau_{\rm em}/\tau_{\rm abs}$ ratio is on 
average ~2.9 for the starless sources with a dispersion of 1.1, 
while it is ~7.2 for the sources with stars with a dispersion of 
3.8. This reflects the fact that the latter group of sources has stronger 
1.2 mm emission (by a factor of ~2.5), which translates to higher 
opacities for the same assumed dust temperature. This clearly 
shows that these sources are in fact either warmer, with an average dust 
temperature greater than 15 K, or else have different dust properties. 
On the other hand, for the starless objects the average ratio 
$\langle\tau_{\rm em}/\tau_{\rm abs}\rangle = 2.9$ is closer to, but still rather far from, unity. This 
suggests that the value of $I_{\rm fore}$ is underestimated and the assumption 
$I_{\rm fore} = I_{\rm zl}$ is incorrect. 

Fig. 5. Plot of the 8 μm foreground intensity calculated for 57 
positions (see text) of the Rathborne et al. (2006) sample as a 
function of the 8 μm mid-infrared radiation field estimated around 
them. The best linear fit is shown as a red solid line. 

Assuming that for starless cores the true 8 μm opacity is given 
by $\tau_{\rm em}$, we can invert Eq. (1) to estimate the value of $I_{\rm fore}$ in terms 
of $I_{\rm MIR}$. We did such a calculation for every starless core and 
plotted the results in Fig. 5, $I_{\rm MIR}$ being measured at the position 
of the core on the large scale emission map (Sec. 3.3). A strong 
correlation is seen between $I_{\rm fore}$ and $I_{\rm MIR}$. The best linear fit to this 
correlation is given by 

$I_{\rm fore} = 0.54 \times I_{\rm MIR}$    (5) 


with a standard deviation of 0.08, and minimum and maximum values 
of 0.4 and 0.75, respectively. This relationship allows us to 
compute an average foreground emission just by estimating the 
mid-infrared radiation field for any IRDC. Figure 6 shows $\tau_{\rm em}$ versus 
$\tau_{\rm abs}$ calculated using Eq. (5), but only for the starless cores this 
time. Here $\langle\tau_{\rm em}/\tau_{\rm abs}\rangle = 1.1$ with a dispersion of only 0.5. 
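
The foreground estimate for a single starless core can be sketched as follows. Substituting $I_{\rm bg} = I_{\rm MIR} - I_{\rm fore}$ into the absorption relation and solving for the foreground gives $I_{\rm fore} = (I - I_{\rm MIR} e^{-\tau})/(1 - e^{-\tau})$. This algebra is our own rearrangement, not a formula quoted in the text, and the input values below are hypothetical.

```python
import math


def foreground_from_emission(i_obs: float, i_mir: float, tau_em: float) -> float:
    """Solve I = (I_MIR - I_fore) * exp(-tau) + I_fore for I_fore, taking tau = tau_em."""
    transmission = math.exp(-tau_em)
    return (i_obs - i_mir * transmission) / (1.0 - transmission)


def opacity_from_extinction(i_obs: float, i_mir: float, i_fore: float) -> float:
    """Opacity recovered from the absorption once I_fore is fixed."""
    return -math.log((i_obs - i_fore) / (i_mir - i_fore))


# Hypothetical starless core: I_MIR = 100 MJy/sr, observed 70 MJy/sr, tau_em = 1.0.
i_mir, i_obs, tau_em = 100.0, 70.0, 1.0
i_fore = foreground_from_emission(i_obs, i_mir, tau_em)
print(f"I_fore / I_MIR = {i_fore / i_mir:.2f}")
print(f"recovered opacity = {opacity_from_extinction(i_obs, i_mir, i_fore):.2f}")
```

Repeating this for each starless core and fitting the resulting $I_{\rm fore}$ against $I_{\rm MIR}$ is the kind of procedure that leads to a linear relation such as the one reported above.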

The relation in Eq. (5) gives us the maximum opacity (and 
equivalent column density) we can probe before reaching saturation. 
Indeed, the rms noise level of the 8 μm images ($\sigma_{\rm noise} \simeq$ 
0.3 MJy/sr) defines the minimum flux we can detect above the 
foreground emission. Below this value, the dust in the cloud is 
basically absorbing all the background emission and we cannot 
recover the true peak column density. This saturation opacity, 
$\tau_{\rm sat}$, is given by $\tau_{\rm sat} = -\ln(\sigma_{\rm noise}/I_{\rm bg})$, with $I_{\rm bg} = 0.46 \times I_{\rm MIR}$. 
The saturation opacity is calculated for every IRDC and given 
in Table 1. We also note that we have $I_{\rm fore} \simeq I_{\rm bg}$, as also 
observed by Johnstone et al. (2003). This suggests that most of 
the foreground emission originates from the same place as the 
background emission and is local to the IRDC, and therefore the 
foreground emission is independent of distance to the IRDC. 
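
The saturation opacity defined in this section depends only on the noise level and on the local mid-infrared radiation field, so it is easy to tabulate. A minimal sketch, using the noise level and background fraction quoted in the text and two hypothetical values of $I_{\rm MIR}$:

```python
import math

SIGMA_NOISE = 0.3   # rms noise of the 8 um images in MJy/sr, as quoted in the text
BG_FRACTION = 0.46  # I_bg = 0.46 * I_MIR, the complement of I_fore = 0.54 * I_MIR


def saturation_opacity(i_mir: float) -> float:
    """Largest 8 um opacity measurable before the absorption saturates."""
    i_bg = BG_FRACTION * i_mir
    return -math.log(SIGMA_NOISE / i_bg)


# Hypothetical radiation fields in MJy/sr, for illustration only.
for i_mir in (20.0, 100.0):
    print(f"I_MIR = {i_mir:5.1f} MJy/sr -> tau_sat = {saturation_opacity(i_mir):.2f}")
```

The logarithmic dependence means that a five-fold increase in the radiation field raises the saturation opacity by only about 1.6.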

3.3. Construction of the opacity maps 

To construct opacity maps of IRDCs all over the Galactic plane 
we mosaiced the GLIMPSE 8 μm and MIPSGAL 24 μm images 
in blocks of 1° in longitude by 2° in latitude using the Montage 
software (http://montage.ipac.caltech.edu/). To allow the 
identification of IRDCs which cross the edges of these blocks, and 
to allow the extraction of regions large enough for our analysis 
around clouds near the edges of these blocks, each consecutive 
block overlaps adjacent blocks by 0.5°. In principle this means 
our extraction could miss IRDCs larger than about 0.5° in size. 
However, the largest cloud identified by Simon et al. (2006a) is 
27′ long. 






[Plot for Fig. 6. Horizontal axis: 8 μm opacity from extinction ($\tau_{\rm abs}$). Annotation: $I_{\rm fore} = 0.54 \times I_{\rm MIR}$.] 

Fig. 6. Same as Fig. 4 but only for starless sources and with an 
8 μm opacity calculated with $I_{\rm fore} = 0.54 \times I_{\rm MIR}$. The solid line 

The sensitivity of the Spitzer images is such that significant 
numbers of stars and galaxies appear in them, even at 8 μm. 
These need to be removed in order to produce clean mid-infrared 
images and opacity maps of the clouds. This has been done 
in two steps. First, we identified the central positions of stars in 
the field using the IDL FIND task from the Astronomy library. 
Second, the values in the pixels containing each star were replaced 
with values calculated from an average gradient plane fit to the 
values of the pixels surrounding the star to be removed. 
While this allowed the recovery of some part of the structure 
of a cloud, it can also produce artifacts. 

Once the 8 μm stars were removed, we calculated the mid-infrared 
radiation field $I_{\rm MIR}$ by smoothing each 8 μm block with a 
normalised Gaussian of FWHM = 308″ (see footnote 1). This size is a 
compromise between several parameters: the typical size of an IRDC, 
the typical spatial scale of the 8 μm emission of the Galactic 
plane and the computation time. Visual inspection of Spitzer images 
suggests that most of the clouds are filamentary with a minor 
axis which is not larger than a few arcminutes. The smoothing 
we have used is well matched to such clouds and our method 
will recover their exact structure. For clouds which are larger 
than the smoothing length, but which are centrally condensed, 
we will detect them but somewhat underestimate their opacity. 
On the other hand, shallow large clouds will be missed (Sections 5 
and 6). Using a larger smoothing length would allow us to better 
detect these large clouds, but at the cost of additional processing 
time and, more significantly, the introduction of spurious artificial 
clouds, especially where the background emission is weak. 
In any case, distinguishing between a feature due to a smooth 
lack of background emission and the presence of a large, low 
column density cloud requires observations of tracers in addition 
to the inferred mid-infrared extinction. We preferred to convolve 
the images with a Gaussian rather than use a median filter in 
order to better recover potential clouds adjacent to strong 8 μm 
emitting structures. 



Having calculated $I_{\rm MIR}$ we are able to compute both $I_{\rm fore}$ and 
$I_{\rm bg}$ images (Section 3.2). Then using Eq. (2) we can construct the 
8 μm opacity image, but before doing so, we smoothed the 8 μm 
images with a 4″ Gaussian in order to suppress high frequency 
noise. 
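
The chain from intensity to opacity described in this section can be sketched on a one-dimensional intensity profile. This is a simplified illustration and not the IDL tools used for the catalogue: star removal and the 4″ pre-smoothing are omitted, edges are handled by replicating the end pixels (an assumption), and the function names are hypothetical.

```python
import math


def gaussian_kernel(fwhm_pix: float, half_width: int) -> list:
    """Normalised 1-D Gaussian kernel with the given FWHM in pixels."""
    sigma = fwhm_pix / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(profile: list, kernel: list) -> list:
    """Convolve a profile with a kernel, replicating the end pixels at the edges."""
    half, n = len(kernel) // 2, len(profile)
    return [sum(w * profile[min(max(i + k - half, 0), n - 1)]
                for k, w in enumerate(kernel))
            for i in range(n)]


def opacity_profile(intensity: list, fwhm_pix: float, fore_fraction: float = 0.54) -> list:
    """Opacity along a profile, with I_MIR taken as the Gaussian-smoothed intensity."""
    i_mir = smooth(intensity, gaussian_kernel(fwhm_pix, int(2 * fwhm_pix)))
    tau = []
    for i_obs, mir in zip(intensity, i_mir):
        i_fore = fore_fraction * mir
        ratio = (i_obs - i_fore) / (mir - i_fore)
        # A non-positive ratio means the pixel is darker than the assumed foreground.
        tau.append(-math.log(ratio) if ratio > 0 else float("inf"))
    return tau


# Hypothetical profile: flat 100 MJy/sr background with a 10-pixel dip to 70 MJy/sr.
profile = [100.0] * 400
for i in range(195, 205):
    profile[i] = 70.0
tau = opacity_profile(profile, fwhm_pix=20.0)
print(f"opacity far from the dip: {tau[0]:.3f}, at the dip centre: {tau[200]:.2f}")
```

Because the smoothing kernel here is only twice as wide as the dip, part of the absorption leaks into the estimate of $I_{\rm MIR}$, which lowers the recovered opacity. This is the same effect that leads to underestimated opacities for clouds larger than the smoothing length.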

A series of artifacts and spurious clouds may arise from our 
method. The first comes from potentially interpreting every 
decrease in the 8 μm emission on spatial scales smaller than 
~5′ as being a potential cloud. This effect is especially important 
at high latitudes where the mid-infrared radiation field is 
low. In these regions a small decrease in the intensity will be 
interpreted as a stronger increase in the opacity than for a similar 
intensity drop in a high mid-infrared radiation field environment. 
Identifying such spurious clouds is difficult, and only follow-ups 
in other tracers in emission will give a definitive answer on the 
nature of these sources. However, we have attempted to minimise 
the number of such objects by selecting a relatively high opacity 
detection threshold. 

Another artifact can arise in regions with strong intensity 
gradients in the initial 8 μm block, where the smoothing may 
artificially produce features identified as clouds, although real clouds 
also exist in these environments (Deharveng et al. 2009). To help 
identify possible spurious objects in regions of large 8 μm 
intensity variations, our catalogue (Table 1; see footnote 2) lists $\delta I_{\rm MIR}$, the 
normalised maximum variation of $I_{\rm MIR}$ within the IRDC, 
defined as $\delta I_{\rm MIR} = (I_{\rm MIR}^{\rm max} - I_{\rm MIR}^{\rm min}) / I_{\rm MIR}$. Our experience suggests 
that clouds with $\delta I_{\rm MIR} > 0.5$ have to be treated with caution. 
These clouds represent 14% of the total number of IRDCs 
included in our sample. Overall, after a visual inspection of every 
IRDC and the removal of obviously spurious IRDCs, we believe 
that more than 90% of the catalogued objects are true IRDCs. 

The tools to automatically construct the maps were mainly 
constructed using IDL packages. 



4. From 8 μm opacities to column densities 

The images resulting from the analysis described above provide 
the spatial 8 μm opacity distribution towards IRDCs. However, 
a more useful quantity is the H2 column density distribution 
of these clouds. Converting 8 μm opacities to H2 column densities 
requires a knowledge of the properties of the absorbing 
dust. Depending on the line of sight and on the structures observed, 
e.g. diffuse material or dense material, the dust chemical 
composition, and thus the dust properties, are different. In 
dense clouds like IRDCs, it is believed that dust grains are 
larger than in the diffuse interstellar medium due to coagulation 
and the presence of icy mantles on the grains. This is supported 
by ISO (Lutz et al. 1996) and more recently Spitzer 
(Indebetouw et al. 2005; Román-Zúñiga et al. 2007) observations, 
which have shown that towards dense clouds the extinction 
cannot be fitted by a single power-law from the near-IR up to the 
mid-IR (Draine & Lee 1984). The recent work has shown that in 
dense clouds the extinction decreases from the near infrared to 
~5 μm and then reaches a plateau up to the silicate absorption 
band around 9 μm. This behavior can be reproduced with dust 
models having $R_V \simeq 5$ (Weingartner & Draine 2001), implying 
larger dust grains (compared to the commonly used value $R_V \simeq 3$ 
for the diffuse interstellar medium). 



1 This size corresponds to (pixel size) × 2$^8$. 

2 The full catalogue, including images of all the clouds, 
is available online at: http://www.irdarkclouds.org or 
http://www.manchester.ac.uk/jodrellbank/sdc 

Fig. 7. 8 μm opacity maps for the 3 IRDCs shown in Fig. 1. The contours go from 0.4 to 0.8 in steps of 0.2 for the figures on the 
right and left, while for the middle figure the contours go from 0.4 to 1.9 in steps of 0.3. 



For the IRDCs we therefore adopt a value of $A_{8\mu m}/A_V = 0.045$ 
(Indebetouw et al. 2005; Román-Zúñiga et al. 2007). To 
convert to the molecular hydrogen column density, $N_{\rm H_2}$, we adopt 

$A_V = 10^{-21} \times N_{\rm H_2}$    (6) 

from Bohlin et al. (1978), although the more recent work by 
Draine (2003), based on the observations of Rachford et al. 
(2002), suggests a 50% larger column density per magnitude of 
extinction. To account for this, and other uncertainties, the column 
densities in this (and subsequent papers) have been calculated 
from the 8 μm optical depth adopting the relation 

$N_{\rm H_2} = \tau_{8\mu m} \times 3[\pm 1] \times 10^{22}$ cm$^{-2}$    (7) 
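
The adopted conversion factor can be cross-checked against the extinction ratios quoted in this section. The sketch below uses the standard relation between extinction in magnitudes and optical depth, $A = 2.5 \log_{10}(e)\,\tau \approx 1.086\,\tau$, which is not stated explicitly in the text. The function names are illustrative.

```python
import math

A8_OVER_AV = 0.045       # adopted ratio A_8um / A_V
NH2_PER_AV = 1.0e21      # cm^-2 per magnitude of A_V, from A_V = 1e-21 * N_H2
ADOPTED_FACTOR = 3.0e22  # cm^-2 per unit 8 um opacity, the adopted relation


def factor_from_extinction_law() -> float:
    """Column density per unit 8 um opacity implied by the two ratios above."""
    mag_per_tau = 2.5 * math.log10(math.e)  # A_8um = 1.086 * tau_8um
    return mag_per_tau / A8_OVER_AV * NH2_PER_AV


def column_density(tau_8um: float, factor: float = ADOPTED_FACTOR) -> float:
    """H2 column density in cm^-2 for a given 8 um opacity."""
    return tau_8um * factor


print(f"factor from the extinction law: {factor_from_extinction_law():.2e} cm^-2")
print(f"with a 50% larger N_H2 per A_V: {1.5 * factor_from_extinction_law():.2e} cm^-2")
for tau in (0.35, 0.7):
    print(f"tau = {tau:.2f} -> N_H2 = {column_density(tau):.2e} cm^-2")
```

The two estimates (about 2.4 and 3.6 × 10$^{22}$ cm$^{-2}$ per unit opacity) bracket the adopted value of 3 × 10$^{22}$ cm$^{-2}$ and fall within its quoted uncertainty.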



5. Identification of sources 



Once the opacity maps have been constructed, we need to extract 
the information on the structures lying within them. For this 
purpose, we have developed a new code, largely inspired by the 
CLUMPFIND source extraction code of Williams et al. (1994). 
The operation of the code is described in Appendix A. The main 
differences compared to CLUMPFIND are how a source is defined 
and how its properties are determined. This new method does not 
assume that every pixel belongs to a source; instead we define the 
boundaries of an object by the local minimum between closest 
neighbours. To estimate the size of the source we then calculate 
the first and second order moments of the absorption distribution, 
and diagonalise the second order moment matrix 
(Appendix A). 

5.1. IRDCs 

In our maps, the IRDCs have been defined as connected structures 
lying above an opacity, $\tau_{8\mu m}$, of 0.35 with a peak above 
0.7 and a diameter greater than 4″. Using Eq. (7), 
these opacity thresholds correspond to column densities of 
$1 \times 10^{22}$ cm$^{-2}$ and $2 \times 10^{22}$ cm$^{-2}$, respectively. With these 
parameters, we have identified 11303 IRDCs (see Fig. 7). Table 1 lists the first 30 IRDCs, 
giving their name, coordinates, $I_{\rm min}$ in MJy/sr, $I_{\rm MIR}$ in MJy/sr, 
$\delta I_{\rm MIR}$ (see Sec. 3.3), ΔX the major axis size in arcseconds, ΔY 
the minor axis size in arcseconds, α the position angle in degrees 
(see Appendix A for an exact definition of these parameters), 
$R_{\rm eq}$ the equivalent radius in arcseconds, which corresponds to the 
radius of a disc having the same area as the IRDC, $\tau_{\rm peak}$ the 
8 μm peak opacity, $\tau_{\rm av}$ the 8 μm opacity averaged over the cloud, 
$\tau_{\rm sat}$ the saturation opacity as described in Section 3.2, the number 
of fragments within the IRDC (Sec. 5.2), whether there is a 
24 μm star in the field/IRDC or not (Sec. 5.3), and $\sigma_{\rm star}$ the 24 μm 
stellar density around the IRDC in number of stars per arcminute 
squared. 
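
The IRDC definition at the start of this section (a connected structure above an opacity of 0.35 with a peak above 0.7) can be illustrated with a simple flood fill on a small opacity grid. This is not the extraction code described in Appendix A: it uses 4-connectivity, replaces the 4″ diameter criterion with a hypothetical `min_pixels` parameter, and does not split structures at local minima.

```python
from collections import deque


def find_clouds(tau_map, tau_min=0.35, tau_peak=0.7, min_pixels=1):
    """Return connected regions above tau_min whose peak opacity exceeds tau_peak."""
    rows, cols = len(tau_map), len(tau_map[0])
    seen = [[False] * cols for _ in range(rows)]
    clouds = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or tau_map[r0][c0] <= tau_min:
                continue
            # Breadth-first flood fill over 4-connected pixels above the threshold.
            queue, pixels = deque([(r0, c0)]), []
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and not seen[rr][cc] and tau_map[rr][cc] > tau_min):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            peak = max(tau_map[r][c] for r, c in pixels)
            if peak > tau_peak and len(pixels) >= min_pixels:
                clouds.append({"pixels": pixels, "peak": peak})
    return clouds


# Hypothetical opacity map: one structure peaking at 0.8 and one peaking at only 0.5.
demo = [
    [0.0, 0.4, 0.5, 0.0, 0.0],
    [0.0, 0.8, 0.4, 0.0, 0.4],
    [0.0, 0.0, 0.0, 0.0, 0.5],
]
for cloud in find_clouds(demo):
    print(f"{len(cloud['pixels'])} pixels, peak opacity {cloud['peak']}")
```

Only the first structure is kept, since the second never exceeds the peak threshold. Requiring a peak at twice the boundary level is one way of selecting a relatively high detection threshold to limit spurious detections, as discussed in Section 3.3.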

5.2. IRDC fragments 

Substructures are seen in almost every IRDC map (Fig. 7). Since 
column density peaks likely pinpoint the sites of the formation of 
the next generation of stars, identifying these peaks is crucial in 
identifying the initial conditions of star formation in IRDCs. We 
call these substructures identified within the IRDCs fragments. 
We prefer this name to, for example, cores, as they have 
been called in other papers (e.g. Rathborne et al. 2006). The term 
core has often been used to identify a substructure which forms 
one star or a small group of stars, and we do not at this stage 
wish to imply any physical interpretation of these structures in 
IRDCs. In particular, since we do not know the distance of the bulk 
of the IRDCs, we cannot infer any physical parameters such as 
the sizes and masses of the fragments/IRDCs. 

To extract the IRDC fragments, we apply the same extraction 
code used to identify the IRDCs (Appendix A). We applied 
different values of $\tau_{\rm step}$ in order to get a comprehensive picture of 
the fragmentation in these IRDCs. In total we identified 20000 
to 50000 fragments depending on $\tau_{\rm step}$ (from 0.1 to 0.35). For 
each of these fragments we have measured the position, size, 
peak and average opacity, and 24 μm star association. As an 
indication of the degree of fragmentation, Table 1 includes the 
number of fragments extracted in each IRDC with $\tau_{\rm step} = 0.35$. 
The nature of these fragments is discussed in detail in Peretto & 
Fuller (2009, in preparation). 

5.3. 24 μm point-like source association 

In order to check for star formation activity associated with the
IRDCs and fragments, we analysed the 24 μm MIPSGAL data,
looking for point-like sources. For this purpose we used the
FIND task of the IDL Astronomy Library. As an initial indication
of the star formation activity of these IRDCs, we have
identified all the 24 μm stars lying within a box (described as
Field in Table 1, col. 16) of twice the calculated extent along the
coordinate axes of each IRDC. Doing so, we find that 32% of the
IRDCs do not have any 24 μm point-like sources in such a box$^3$.
On the other hand, 20% of the IRDCs have a 24 μm source lying
within their boundaries (Table 1, col. 17). Therefore, the percentage
of actively star-forming IRDCs is likely to be between 20% and
68%, the upper limit corresponding to the 100% - 32% of clouds
with at least one source in their field. A more detailed analysis
of the stellar content of IRDCs will be presented in a following paper.
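The bracketing of the active fraction can be illustrated with a short sketch. It flags, for one cloud, whether a point source falls within the field box (twice the cloud extent along each coordinate axis) and whether it falls within the cloud itself, and then derives the lower and upper bounds from these flags. This is only an illustration under stated assumptions: the cloud boundary is approximated by a simple bounding box rather than the extracted boundary, positions are treated as flat offsets, and the names `Cloud`, `in_box`, `association_flags` and `active_fraction_bounds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cloud:
    x0: float  # cloud centre along the first coordinate axis
    y0: float  # cloud centre along the second coordinate axis
    dx: float  # full extent of the cloud along the first axis
    dy: float  # full extent of the cloud along the second axis


def in_box(cloud, x, y, scale):
    """True if (x, y) lies in a box of `scale` times the cloud extent."""
    return (abs(x - cloud.x0) <= scale * cloud.dx / 2.0
            and abs(y - cloud.y0) <= scale * cloud.dy / 2.0)


def association_flags(cloud, sources):
    """Return (source in field, source in cloud) for a list of (x, y) sources."""
    in_field = any(in_box(cloud, x, y, 2.0) for x, y in sources)
    in_cloud = any(in_box(cloud, x, y, 1.0) for x, y in sources)
    return in_field, in_cloud


def active_fraction_bounds(flags):
    """Lower bound: source inside the cloud; upper bound: source in the field."""
    n = len(flags)
    lower = sum(1 for _, in_cloud in flags if in_cloud) / n
    upper = sum(1 for in_field, _ in flags if in_field) / n
    return lower, upper
```

Applied to the catalogue flags of Table 1 (columns 16 and 17), the same counting gives the 20% and 68% limits quoted above.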

Concerning the fragments, between 1% and 6% have stars 
lying within their boundaries, depending on the parameters used 
to extract the fragments (Peretto & Fuller 2009, in preparation). 

We have also calculated the 24 μm stellar surface density
around each extracted IRDC (Table 1, col. 18). This number
gives an indication of the crowding in the area around the IRDC.

5.4. Uncertainties on the opacity estimates 

The main source of uncertainty in the opacity maps arises from
the estimate of the foreground intensity $I_{\rm fore}$. As explained in
Section 3, we used the relation $I_{\rm fore} = 0.54 \times I_{\rm MIR}$ to calculate
this quantity for every cloud. However, as can be seen in Fig. 5, a
dispersion of ~0.1 exists on this relation, with a maximum
variation of ±0.25. To assess the impact of such variations on the
calculated peak opacities of the clouds, we have computed for
every cloud the ratio, K, of the peak opacity inferred assuming
$I_{\rm fore} = C_f \times I_{\rm MIR}$, where $0.25 < C_f < 0.75$, to the peak opacity
calculated with the fiducial $I_{\rm fore}$ (Eq. 5, $C_f = 0.54$). Figure 8
shows the median value of this ratio as a function of $C_f$. For
each value of $C_f$ we also calculated the dispersion in K across
the entire sample of clouds. These dispersions were all < 0.1,
except for the case $C_f = 0.75$, where the dispersion in K reached
0.3. The range in K shown in Fig. 8 provides an estimate of the
peak opacity uncertainty related to the choice or variation of $I_{\rm fore}$.
In most cases this uncertainty is less than a factor of 2, but it can
be as large as 10 in extreme cases. On the same figure we also
plot the fraction of saturated clouds as a function of the adopted
$I_{\rm fore}$. Naturally, the higher $I_{\rm fore}$, the higher the number of
saturated clouds, reaching 80% in the most extreme case but remaining
less than 10% for $I_{\rm fore} < 0.6\,I_{\rm MIR}$. In the case of $C_f = 0.54$,
the percentage of saturated clouds is 3%. This is consistent with
a visual inspection of the 8 μm intensity profiles of a sample of
clouds, which indicates that less than 10% of the objects show a
flattening in their inner regions, a signature of possible saturation.
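The behaviour of the ratio K and the onset of saturation can be illustrated with a minimal sketch. It assumes that the opacity follows the usual absorption relation $\tau = -\ln[(I_{8\mu m} - I_{\rm fore})/(I_{\rm MIR} - I_{\rm fore})]$ with $I_{\rm fore} = C_f \times I_{\rm MIR}$; this relation is restated here as an assumption and should be checked against Section 3. The function names and the input value $I_{\rm min} = 0.7\,I_{\rm MIR}$ are hypothetical and chosen only for illustration.

```python
import math


def peak_opacity(i_min, i_mir, c_f):
    """Peak opacity for a foreground I_fore = c_f * I_MIR.

    Returns None when the minimum intensity does not exceed the assumed
    foreground, i.e. when the absorption is saturated.
    """
    i_fore = c_f * i_mir
    if i_min <= i_fore:
        return None
    return -math.log((i_min - i_fore) / (i_mir - i_fore))


def opacity_ratio(i_min, i_mir, c_f, c_fid=0.54):
    """Ratio K of the peak opacity for c_f to that for the fiducial c_fid."""
    tau = peak_opacity(i_min, i_mir, c_f)
    tau_fid = peak_opacity(i_min, i_mir, c_fid)
    if tau is None or tau_fid is None:
        return None
    return tau / tau_fid


# Illustrative cloud whose minimum intensity is 70% of the background level.
for c_f in (0.25, 0.40, 0.54, 0.65, 0.75):
    print(c_f, opacity_ratio(0.7, 1.0, c_f))
```

In this toy case K is below one for $C_f < 0.54$, rises steeply as $C_f$ approaches $I_{\rm min}/I_{\rm MIR}$, and the cloud becomes saturated beyond that value, which is the qualitative behaviour described for Fig. 8.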

Another source of uncertainty is the variation of the foreground
intensity relative to the background emission. Since we
have shown that on average the background emission is equal to
the foreground emission (Sec. 3.2), we assumed that the variations
of both quantities in front of and behind a cloud have the same
origin, and therefore the same amplitude. However, this assumption
could be wrong. For instance, one of the two, more likely the
foreground, could be constant over the extent of the cloud, with
the other one containing all the variations observed in the
mid-infrared radiation field. The impact of such effects on the
opacity estimate is similar to the one described above. Clouds with
small variations in their mid-infrared radiation fields are thus
better constrained than those with high $\delta I_{\rm MIR}$.

As mentioned in the previous section, large clouds (> 5')
have opacities which are likely to be underestimated; however,
this effect is minor compared with those mentioned above.
Overall, considering all the factors which contribute to the
uncertainty in opacity, we estimate that the values derived from
the Spitzer data are uncertain by a factor of no more than two.
This result is consistent with the observations of a subset of
clouds in the 1.2 mm continuum emission from the dust (Fig. 6).

3 In Table 1, columns 16 and 17, y stands for yes and indicates the
presence of a star within the field (and/or the cloud), while n indicates
that there are no such stars.

Fig. 8. (top): Correction factor to apply to peak opacities in order
to correct for foreground intensities different from the one
we used in this study. (bottom): Fraction of saturated clouds as
a function of the assumption made on the foreground intensity.

6. Comparison with the MSX IRDC catalogue 

Simon et al. (2006a) undertook a systematic survey of IRDCs
using MSX data. Their survey covers a larger area of the Galactic
plane than ours, owing to the smaller coverage of the GLIMPSE
survey. In total, Simon et al. (2006a) extracted 6721 clouds
between 10° < |l| < 65° and -1° < b < 1°. For the same coverage
we extracted 11303 Spitzer dark clouds, which is about
1.7 times as many. However, because the detection limits, both peak
and boundary, differ between the two surveys, a simple comparison of
the numbers of clouds is incomplete, and so a more detailed
comparison has been performed.

As illustrated by Fig. 9, it appears that a minority of IRDCs
are common to both the MSX and Spitzer catalogues. In fact, only
20% of the Spitzer dark clouds appear in the MSX catalogue
(corresponding to 25% of MSX clouds being associated with a
Spitzer dark cloud). Based on this comparison we define 3 categories
of clouds: Spitzer only, which are clouds appearing only
in our catalogue; MSX only, which are clouds appearing only
in the Simon et al. catalogue; and both, which are clouds appearing
in both catalogues. Figure 10 shows an example of an IRDC in
each of these categories.

Of the Spitzer only clouds, 51% do not meet the size criterion,
$R_{\rm eq} > 20''$, imposed by Simon et al. (2006a) to identify the
MSX IRDCs, explaining why they are not in the MSX catalogue.
The remaining ~30% of Spitzer only IRDCs result from
the difference in the method used to estimate the background.
Using a median filter of 30' diameter, Simon et al. (2006a)
underestimated the background almost everywhere in the inner
|b| < 0.25° of the Galactic plane. As a consequence, the
inferred background reaches a value similar to that of the IRDC
itself, and the IRDC is therefore not detected. This artifact can be
seen when plotting the source fraction as a function of Galactic
latitude (Fig. 11). We see a significant difference between the
distributions of MSX and Spitzer IRDCs. The MSX IRDCs have a
rather flat distribution in a central 1° region, whereas the Spitzer
IRDC distribution has a clear central peak decreasing sharply on
both sides. We believe that this difference arises from the
difference in the background construction.
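The effect of the filter width on the inferred background can be illustrated with a toy one-dimensional latitude profile. The sketch below is not a reproduction of either survey's processing: the exponential profile, its amplitude and its 10' scale height are invented for illustration, both estimates use a running median (whereas the Spitzer maps were smoothed with a Gaussian), and the helper `median_filter` is hypothetical. It only shows that a median taken over a window wider than a sharply peaked emission layer falls well below the true level at b = 0.

```python
import math
from statistics import median


def median_filter(values, half_width):
    """Running median over 2 * half_width + 1 samples (edges truncated)."""
    n = len(values)
    return [median(values[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


# Toy latitude profile sampled every 1 arcmin over |b| < 1 deg:
# a flat level plus emission peaking sharply at b = 0.
lat = [float(i) for i in range(-60, 61)]                    # arcmin
emission = [1.0 + 4.0 * math.exp(-abs(b) / 10.0) for b in lat]

wide = median_filter(emission, 15)    # ~30 arcmin window
narrow = median_filter(emission, 2)   # ~5 arcmin window
centre = len(lat) // 2

print("true level at b = 0     :", emission[centre])
print("30 arcmin median filter :", wide[centre])
print("5 arcmin median filter  :", narrow[centre])
```

With these assumed numbers the wide filter recovers little more than half of the central level, so a dark cloud of moderate depth sitting on the peak would not stand out against the inferred background.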

On the other hand, the MSX only clouds have very low contrast
(opacity peaks) and are particularly large. The detection of
such clouds in the MSX data was possible owing to the large
background smoothing length and the low contrast threshold
used by Simon et al. (2006a). In order to investigate this effect
and see whether our method could recover these clouds when
using a larger Gaussian, we smoothed the block shown in Fig. 9
to 20' and performed the extraction of IRDCs on the resulting
opacity map. Doing so, we find twice as many clouds (40%)
which are in both catalogues, but at the same time 35% of the
Spitzer clouds which were initially detected using a smaller Gaussian
are lost. The remaining MSX only clouds are just too shallow to
be identified given the opacity threshold we used, 0.7. In addition,
looking at their 8 μm emission, it is not clear whether many
of these clouds are real or just a decrease in the background of
the Galactic plane.

Overall, we can say that 80% of our catalogue comprises
IRDCs which were previously unknown, and that it constitutes the most
complete catalogue available of such objects with column density
peaks above $1 \times 10^{22}$ cm$^{-2}$.

7. Summary 

This paper, the first of a series dedicated to the study of infrared
dark clouds, describes the techniques developed to establish a
complete catalogue of Spitzer dark clouds. We analysed the full
data set of the 8 μm GLIMPSE Galactic plane survey to look for IRDCs.
We extracted 11303 of these clouds, obtaining column density
maps for each of them and characterizing their physical properties.
We also identified the substructures lying within these clouds,
extracting up to ~50000 of these. Table 2 presents a summary
of the average and range of properties of both the clouds and
these substructures (fragments). The full table of the properties
of the clouds and fragments, plus images and opacity maps, are
available from an online database$^4$. In subsequent papers we will
exploit the tremendous quantity of information concerning the
initial conditions for the formation of stars in the Galaxy
contained within this set of IRDC column density maps.

4 The database is available at http://www.irdarkclouds.org or
http://www.manchester.ac.uk/jodrellbank/sdc

Fig. 9. In grey scale is the Spitzer 8 μm emission of one of the blocks we constructed around l = 30°. The black circles indicate the
position and size of the Spitzer IRDCs identified in this study, while the red squares mark the position and size of the MSX
IRDCs. We see on this image that the Spitzer IRDCs are more numerous where the background is stronger, while, quite surprisingly,
this is not the case for the MSX IRDCs. The MSX clouds detected at |b| > 0.5° are on average the larger clouds in the Simon et al.
(2006a) sample. For most of them, we do not detect any Spitzer IRDCs at these positions in our standard processing (using a 5'
Gaussian), but some are detected when using a larger smoothing function (see text).

Fig. 10. Comparison of three IRDCs seen with Spitzer at 8 μm illustrating the 3 categories of IRDC based on their MSX and Spitzer
detection. Note that the cloud detected only in the MSX catalogue (left panel) exhibits much lower extinction than the other two
objects.

Appendix A: Method for extracting sources

We developed a new code to extract sources from our opacity
maps. The first part of our algorithm is based mainly on the
same principle as the one developed by Williams et al. (1994)
for CLUMPFIND. We set two main parameters: the
lowest contour level below which we do not consider any structure,
$\tau_{\rm thres}$, and a step in units of the map, $\tau_{\rm step}$. Then we look
at every local peak between two consecutive levels, up to the
maximum of our image. The number of local peaks gives us the
number of fragments we will extract from the image, unless the
final estimated size is smaller than the final angular resolution or
the amplitude between the peak of the fragment and its external
boundary is less than $\tau_{\rm step}$. Then we have to determine the
pixels we associate with each local peak. For this, for every peak,
we go down, level by level, and check whether the local peak we are
looking at is the only one in this contour. If so, we move to the
following contour and repeat the check. If there is more than one
local peak within the contour, we look for the local minimum
between these two peaks, $\tau_{\rm lev}$, and the pixels lying above $\tau_{\rm lev}$
and associated with the considered peak define the extent of the
fragment.

Fig. A.1. Illustration of our extraction method. This figure shows
the opacity profile of a typical IRDC. The bottom dashed line
shows the opacity threshold beneath which structures are ignored.
The dotted lines show the different slices through the
cloud, every slice being separated by $\tau_{\rm step}$. The upper dashed line
shows the opacity corresponding to the local minimum, $\tau_{\rm lev}$,
between the two local peaks shown on that plot. In such a cloud,
our method would extract one IRDC (colored area) and two fragments
(colored area + dashed-dotted lines) within it.
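The extraction logic described in this appendix can be sketched on a one-dimensional opacity profile such as the one of Fig. A.1. The code below is a simplified illustration, not the code used for the catalogue: it works in 1D only, it does not step through discrete contour levels but evaluates the saddle level between neighbouring peaks directly, it merges a low-contrast peak into its neighbour by simply dropping it, and it omits the angular-resolution size criterion. All function names and the example profile are hypothetical.

```python
def runs_above(profile, tau_thres):
    """Return [start, stop) index ranges where profile >= tau_thres."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= tau_thres and start is None:
            start = i
        elif value < tau_thres and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def local_peaks(profile, start, stop):
    """Indices of local maxima inside one run (one index per plateau)."""
    peaks = []
    for i in range(start, stop):
        left = profile[i - 1] if i > start else float("-inf")
        right = profile[i + 1] if i + 1 < stop else float("-inf")
        if profile[i] > left and profile[i] >= right:
            peaks.append(i)
    return peaks


def boundary_level(profile, peaks, k, tau_thres):
    """tau_lev of peak k: highest saddle towards its neighbours, else tau_thres."""
    lev = tau_thres
    if k > 0:
        lev = max(lev, min(profile[peaks[k - 1]:peaks[k] + 1]))
    if k + 1 < len(peaks):
        lev = max(lev, min(profile[peaks[k]:peaks[k + 1] + 1]))
    return lev


def extract_fragments(profile, tau_thres, tau_step):
    """Return one dict per fragment: peak index, tau_lev and pixel extent."""
    fragments = []
    for start, stop in runs_above(profile, tau_thres):
        peaks = local_peaks(profile, start, stop)
        # Drop the lowest-contrast peak until all contrasts reach tau_step.
        while peaks:
            contrasts = [profile[p] - boundary_level(profile, peaks, k, tau_thres)
                         for k, p in enumerate(peaks)]
            k_min = min(range(len(peaks)), key=contrasts.__getitem__)
            if contrasts[k_min] >= tau_step:
                break
            del peaks[k_min]
        for k, p in enumerate(peaks):
            lev = boundary_level(profile, peaks, k, tau_thres)
            lo, hi = p, p
            while lo > start and profile[lo - 1] > lev:
                lo -= 1
            while hi + 1 < stop and profile[hi + 1] > lev:
                hi += 1
            fragments.append({"peak": p, "tau_lev": lev, "extent": (lo, hi)})
    return fragments


# Two-peaked profile in the spirit of Fig. A.1 (values are illustrative).
profile = [0.2, 0.5, 0.8, 1.4, 2.0, 1.5, 1.1, 1.6, 2.4, 1.7, 0.9, 0.4]
for frag in extract_fragments(profile, tau_thres=0.7, tau_step=0.35):
    print(frag)
```

With a step of 0.35 the sketch returns two fragments bounded by the saddle level between the peaks, while a step larger than the contrast of the weaker peak leaves a single fragment bounded by the threshold, which mirrors the dependence of the fragment counts on $\tau_{\rm step}$.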



In order to measure the size of the clouds and fragments, we
did not want to assume any particular shape for the source. So,
once we have identified all the pixels associated with a given peak,
we first estimate the center of gravity of the cores, $(X_{CG}, Y_{CG})$,
using

$$X_{CG} = \frac{\sum_{i=1}^{N} V_i x_i}{\sum_{i=1}^{N} V_i}, \qquad Y_{CG} = \frac{\sum_{i=1}^{N} V_i y_i}{\sum_{i=1}^{N} V_i} \qquad \mathrm{(A.1)}$$

where $V_i$ is the value of the $i$th pixel, $x_i$ and $y_i$ its coordinates,
and $N$ is the number of pixels. Then, we calculate the matrix of
moment of inertia, $I$:

$$I = \begin{pmatrix} I_{xx} & I_{xy} \\ I_{yx} & I_{yy} \end{pmatrix} \qquad \mathrm{(A.2)}$$

with

$$I_{xx} = \sum_{i=1}^{N} V_i (y_i - Y_{CG})^2 \qquad \mathrm{(A.3)}$$

$$I_{yy} = \sum_{i=1}^{N} V_i (x_i - X_{CG})^2 \qquad \mathrm{(A.4)}$$

$$I_{xy} = I_{yx} = -\sum_{i=1}^{N} V_i (x_i - X_{CG})(y_i - Y_{CG}) \qquad \mathrm{(A.5)}$$

Finally, we diagonalize $I$ in order to obtain its two eigenvalues
and eigenvectors. From these we can calculate the position
angle $\alpha$ of the major axis (given by the eigenvector associated with
the smallest eigenvalue). To estimate the sizes of the cores we
calculate the following values:

$$\sigma_x^2 = \sum_{i=1}^{N} \left[ x_i \cos(\alpha) - y_i \sin(\alpha) \right]^2$$

$$\sigma_y^2 = \sum_{i=1}^{N} \left[ x_i \sin(\alpha) + y_i \cos(\alpha) \right]^2$$

The sizes are then estimated by $\Delta X = 2 \times \sqrt{\sigma_x^2 / N}$ and
$\Delta Y = 2 \times \sqrt{\sigma_y^2 / N}$.

The three values, $\Delta X$, $\Delta Y$ and $\alpha$, are given for every IRDC in
Table 1.

Fig. A.2. Opacity map of the middle IRDC shown in Fig. 10.
Our extraction method detected 7 fragments within this IRDC
when $\tau_{\rm step} = 0.1$. The black contours mark the $\tau_{\rm lev}$ value
(boundary contour) for each fragment. The sizes and position angles
are also given in brackets. We can see that these values
give a reasonable description of the shape of the fragments
(and IRDC).
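The moment analysis of this appendix can be written compactly as a sketch. It follows Eqs. (A.1) to (A.5) for the centre of gravity and the inertia matrix, and uses the closed-form orientation of the eigenvector with the smallest eigenvalue of a 2x2 symmetric matrix instead of a numerical diagonalization. Two conventions are assumptions of this sketch rather than statements about the catalogue: the angle is measured counter-clockwise from the x axis (not East of North), and the pixel coordinates entering the size estimate are taken relative to the centre of gravity. The function name and returned keys are hypothetical.

```python
import math


def shape_from_moments(pixels):
    """Centre of gravity, major-axis angle and sizes of a set of pixels.

    pixels: list of (x, y, value) tuples belonging to one cloud or fragment.
    """
    n = len(pixels)
    v_tot = sum(v for _, _, v in pixels)
    x_cg = sum(v * x for x, _, v in pixels) / v_tot
    y_cg = sum(v * y for _, y, v in pixels) / v_tot

    # Inertia matrix elements, Eqs. (A.3)-(A.5).
    i_xx = sum(v * (y - y_cg) ** 2 for _, y, v in pixels)
    i_yy = sum(v * (x - x_cg) ** 2 for x, _, v in pixels)
    i_xy = -sum(v * (x - x_cg) * (y - y_cg) for x, y, v in pixels)

    # Direction of the eigenvector with the smallest eigenvalue (major axis),
    # measured counter-clockwise from the x axis.
    alpha = 0.5 * math.atan2(-2.0 * i_xy, i_yy - i_xx)
    cos_a, sin_a = math.cos(alpha), math.sin(alpha)

    # Unweighted second moments along and across the major axis.
    s_major = sum(((x - x_cg) * cos_a + (y - y_cg) * sin_a) ** 2
                  for x, y, _ in pixels)
    s_minor = sum((-(x - x_cg) * sin_a + (y - y_cg) * cos_a) ** 2
                  for x, y, _ in pixels)

    return {
        "x_cg": x_cg,
        "y_cg": y_cg,
        "alpha": alpha,
        "delta_x": 2.0 * math.sqrt(s_major / n),
        "delta_y": 2.0 * math.sqrt(s_minor / n),
    }


# Five equal pixels along a diagonal: the major axis should lie at 45 degrees.
diagonal = [(float(i), float(i), 1.0) for i in range(-2, 3)]
print(shape_from_moments(diagonal))
```

Because no shape is assumed, the same function applies to clouds and to fragments; only the list of pixels attributed to the structure changes.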
Table 1. SDC properties for the first 30 out of 11303 in the catalogue. The full catalogue is available online. The columns give a running number (1), the name of the source based
on its Galactic coordinates (2), the right ascension and declination (J2000) of the cloud peak (3,4), the minimum 8 μm emission towards the cloud ($I_{\rm min}$) (5), the background
8 μm emission ($I_{\rm MIR}$) (6), the maximum $I_{\rm MIR}$ variation within the IRDC ($\delta I_{\rm MIR}$; Sec. 3.3) (7), the size of the cloud along its major and minor axes in arcseconds (8,9), the position
angle of the major axis of the cloud in degrees East of North (10), the equivalent radius ($R_{\rm eq}$; Sec. 5.1) of the cloud (11), the peak and average optical depth of the cloud at 8 μm
(12,13), the optical depth at 8 μm at which the absorption would be saturated (14), the number of fragments in the cloud identified with $\tau_{\rm step} = 0.35$ (15; Sec. 5.2), whether there
are 24 μm stars in the field (16) and in the cloud (17; Sec. 5.3), and the density of stars around the cloud (18).
Table 2. Average properties of IRDCs and fragments (extracted with $\tau_{\rm step} = 0.35$).

Structures   Number of   R_eq (arcsec)      Aspect ratio        τ_av                 τ_peak               Star association
             objects     Average  Range     Average  Range      Average  Range       Average  Range       (%)
IRDCs        11303       31       4-374     2.2      1.0-11.6   0.15     0.01-2.35   1.15     0.70-8.36   20-68
Fragments    19838       19       1-205     2.0      1.0-11.6   0.75     0.01-7.88   1.63     0.70-8.36